
Review for NeurIPS paper: Novelty Search in Representational Space for Sample Efficient Exploration

Neural Information Processing Systems

Additional Feedback: The method seems to be restricted to deterministic environments. Could the authors add a brief discussion of why this is the case and of how the approach might be extended to stochastic environments (perhaps in the supplementary material)? In most approaches, the discount factor decays exponentially with temporal distance. Why did the authors choose to make it a function of state and action, and why should it be learned? Having the environment return the discount factor is uncommon. The choice of the learned representation size seems to rely on some domain knowledge.
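
To make the discount-factor question concrete, the sketch below contrasts a constant discount with a per-step discount that depends on the transition. It is only an illustration of the general idea: the function name `discounted_return` and the example values are hypothetical, and the paper's exact formulation is not reproduced here.

```python
def discounted_return(rewards, discounts):
    """Return G_0 = r_0 + g_0 * (r_1 + g_1 * (r_2 + ...)).

    rewards[t] is the reward at step t, and discounts[t] is the discount
    applied to everything after step t. In a state-action-dependent setting
    one would have discounts[t] = gamma(s_t, a_t) (an assumption made for
    this sketch); a constant list recovers the usual gamma**k weighting.
    """
    g = 0.0
    # Accumulate backwards so each step's discount scales the tail return.
    for r, d in zip(reversed(list(rewards)), reversed(list(discounts))):
        g = r + d * g
    return g


# Constant discount: 1 + 0.9 + 0.9**2
constant = discounted_return([1.0, 1.0, 1.0], [0.9, 0.9, 0.9])

# A zero discount at step 1 cuts off all later rewards, as a terminal
# transition would.
truncated = discounted_return([1.0, 1.0, 1.0], [0.9, 0.0, 0.9])
```

With a constant list the result matches the familiar exponentially weighted sum, which is the special case the reviewer refers to.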


Review for NeurIPS paper: Novelty Search in Representational Space for Sample Efficient Exploration

Neural Information Processing Systems

This paper proposes a novelty-search exploration method based on a learned encoding of the environment. The method computes the novelty of a state in a learned embedding space and encourages the agent to optimize for this novelty using a combined model-free and model-based approach. Motivated by the information bottleneck principle, the embedding space is learned by maximizing compression while retaining an accurate dynamics model, which compresses the environment into a small state space well suited to novelty-based exploration. The experiments were clear and well motivated: grid-type domains to evaluate state coverage, and two control domains to evaluate how novelty search improves the agent's ability to perform control tasks. I particularly enjoyed the learned abstract visualization of the labyrinth environment in Figure 1.


Novelty Search in Representational Space for Sample Efficient Exploration

Neural Information Processing Systems

We present a new approach for efficient exploration that leverages a low-dimensional encoding of the environment learned with a combination of model-based and model-free objectives. Our approach uses intrinsic rewards based on the distance to nearest neighbors in the low-dimensional representational space to gauge novelty. One key element of our approach is the use of information-theoretic principles to shape our representations so that our novelty reward goes beyond pixel similarity. We test our approach on a number of maze tasks, as well as a control problem, and show that our exploration approach is more sample-efficient than strong baselines.
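
The nearest-neighbor novelty reward described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `knn_novelty_reward`, the choice of Euclidean distance, the averaging over `k` neighbors, and the default `k=5` are all hypothetical, and the encodings are assumed to come from some already-trained encoder.

```python
import numpy as np


def knn_novelty_reward(z, visited, k=5):
    """Mean Euclidean distance from encoding z to its k nearest visited encodings.

    z        -- 1-D array, the low-dimensional encoding of the current state
    visited  -- 2-D array-like, one row per previously visited state encoding
    k        -- number of neighbors to average over (hypothetical default)
    """
    z = np.asarray(z, dtype=float)
    visited = np.asarray(visited, dtype=float)
    if visited.size == 0:
        # Assumption for this sketch: with no history there is nothing to
        # compare against, so return a neutral reward of zero.
        return 0.0
    dists = np.linalg.norm(visited - z, axis=1)
    k = min(k, len(dists))
    # np.partition places the k smallest distances in the first k slots.
    nearest = np.partition(dists, k - 1)[:k]
    return float(nearest.mean())


# Encodings near previously visited ones score low; distant ones score high.
history = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
familiar = knn_novelty_reward([0.0, 0.0], history, k=2)
novel = knn_novelty_reward([10.0, 10.0], history, k=2)
```

Because distances are measured between encodings rather than raw observations, two states that look alike in pixel space can still receive different rewards if the learned representation separates them.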